Toric algebra of hypergraphs
The edges of any hypergraph parametrize a monomial algebra called the edge
subring of the hypergraph. We study presentation ideals of these edge subrings,
and describe their generators in terms of balanced walks on hypergraphs. Our
results generalize those for the defining ideals of edge subrings of graphs,
which are well-known in the commutative algebra community, and popular in the
algebraic statistics community. One of the motivations for studying toric
ideals of hypergraphs comes from algebraic statistics, where generators of the
toric ideal give a basis for random walks on fibers of the statistical model
specified by the hypergraph. Further, understanding the structure of the
generators gives insight into the model geometry.
Comment: Section 3 is new: it explains connections to log-linear models in
algebraic statistics and to combinatorial discrepancy. Section 6 (open
problems) has been moderately revised.
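A binomial in the edge variables lies in the toric ideal of a hypergraph exactly when its two monomials map to the same monomial in the vertex variables, that is, when the two multisets of edges cover every vertex the same number of times. The following is a minimal sketch of that degree check only. The function names and the tuple encoding of edges are illustrative assumptions, and the sketch does not reproduce the paper's full notion of a balanced walk.

```python
from collections import Counter

def vertex_degrees(edges):
    """Count how often each vertex is covered by a multiset of hyperedges."""
    degrees = Counter()
    for edge in edges:
        degrees.update(edge)  # each vertex of the edge is counted once
    return degrees

def is_balanced(blue_edges, red_edges):
    """True if both edge multisets cover every vertex equally often."""
    return vertex_degrees(blue_edges) == vertex_degrees(red_edges)

# Graph case: the two perfect matchings of a 4-cycle on vertices 1..4.
blue = [(1, 2), (3, 4)]
red = [(2, 3), (1, 4)]

# Hypergraph case: 3-uniform edges on vertices 1..6.
blue3 = [(1, 2, 3), (4, 5, 6)]
red3 = [(1, 2, 4), (3, 5, 6)]
```

Here `is_balanced(blue, red)` and `is_balanced(blue3, red3)` both hold, so each pair corresponds to a binomial in the respective toric ideal.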
Random Sampling in Computational Algebra: Helly Numbers and Violator Spaces
This paper transfers a randomized algorithm, originally used in geometric
optimization, to computational problems in commutative algebra. We show that
Clarkson's sampling algorithm can be applied to two problems in computational
algebra: solving large-scale polynomial systems and finding small generating
sets of graded ideals. The cornerstone of our work is showing that the theory
of violator spaces of G\"artner et al.\ applies to polynomial ideal problems.
To show this, one utilizes a Helly-type result for algebraic varieties. The
resulting algorithms have expected runtime linear in the number of input
polynomials, making the ideas interesting for handling systems with very large
numbers of polynomials, but whose rank in the vector space of polynomials is
small (e.g., when the number of variables and the degree are constant).
Comment: Minor edits, added two references; results unchanged.
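To illustrate the sampling idea, here is a sketch of a Clarkson-style iterative reweighting scheme on a toy violator space rather than on polynomials. The constraints are points on a line, the "solution" of a subset is its smallest enclosing interval, and a point violates an interval if it lies outside it, so the combinatorial dimension is 2. The sample size `6 * delta**2` and the threshold `1 / (3 * delta)` are one common parameter choice, and sampling with replacement via `random.choices` is a simplification. None of this is taken from the paper itself.

```python
import random

def violators(points, interval):
    """Indices of points lying outside the closed interval."""
    lo, hi = interval
    return [i for i, p in enumerate(points) if p < lo or p > hi]

def clarkson_enclosing_interval(points, delta=2, seed=0):
    """Smallest interval containing all points (assumes a non-empty list).

    Repeatedly solves the problem on a small weighted random sample and
    doubles the weight of violated constraints when they are light.
    """
    rng = random.Random(seed)
    n = len(points)
    weights = [1] * n
    sample_size = 6 * delta * delta
    while True:
        # Weighted sample of constraint indices (with replacement).
        sample = rng.choices(range(n), weights=weights, k=sample_size)
        chosen = [points[i] for i in sample]
        interval = (min(chosen), max(chosen))  # solve the small subproblem
        bad = violators(points, interval)
        if not bad:
            return interval  # no constraint is violated: done
        # Reweight only if the violators carry little total weight.
        if 3 * delta * sum(weights[i] for i in bad) <= sum(weights):
            for i in bad:
                weights[i] *= 2
```

Each round touches every constraint once but only ever solves a subproblem of constant size, which is the feature that makes this kind of scheme attractive when the number of constraints is large and the combinatorial dimension is small.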
Goodness of fit for log-linear ERGMs
Many popular models from the networks literature can be viewed through a
common lens of contingency tables on network dyads, resulting in
\emph{log-linear ERGMs}: exponential family models for random graphs whose
sufficient statistics are linear on the dyads. We propose a new model in this
family, the \emph{$p_1$-SBM}, which combines node and group effects common in
network formation mechanisms. In particular, it is a generalization of several
well-known ERGMs including the stochastic blockmodel for undirected graphs, the
degree-corrected version of it, and the directed $p_1$ model without group
structure.
We frame the problem of testing model fit for the log-linear ERGM class
through an exact conditional test whose $p$-value can be approximated
efficiently in networks of both small and moderately large sizes. The sampling
methods we build rely on a dynamic adaptation of Markov bases. We use quick
estimation algorithms adapted from the contingency table literature and
effective sampling methods rooted in graph theory and algebraic statistics. The
performance and scalability of the method are demonstrated on two data sets from
biology: the connectome of \emph{C. elegans} and the interactome of
\emph{Arabidopsis thaliana}. These two networks -- a neuronal network and a
protein-protein interaction network -- have been popular examples in the
network science literature. Our work provides a model-based approach to
studying them.
Strong Hanani-Tutte on the Projective Plane
If a graph can be drawn in the projective plane so that every two non-adjacent edges cross an even number of times, then the graph can be embedded in the projective plane.
Gr\"obner Bases and Nullstellens\"atze for Graph-Coloring Ideals
We revisit a well-known family of polynomial ideals encoding the problem of
graph-$k$-colorability. Our paper describes how the inherent combinatorial
structure of the ideals implies several interesting algebraic properties.
Specifically, we provide lower bounds on the difficulty of computing Gr\"obner
bases and Nullstellensatz certificates for the coloring ideals of general
graphs. For chordal graphs, however, we explicitly describe a Gr\"obner basis
for the coloring ideal, and provide a polynomial-time algorithm.
Comment: 16 pages
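A standard polynomial encoding of colorability, assumed here to be the family the abstract refers to, uses one generator $x_i^k - 1$ per vertex and one generator $x_i^{k-1} + x_i^{k-2}x_j + \dots + x_j^{k-1}$ per edge. Over the complex numbers, proper $k$-colorings then correspond to common zeros whose coordinates are $k$-th roots of unity. The sketch below checks this numerically for a given coloring. The function names and the tolerance are illustrative choices.

```python
import cmath

def generator_values(x, edges, k):
    """Evaluate the vertex and edge generators at the point x."""
    vertex_values = [xi**k - 1 for xi in x]
    edge_values = [
        sum(x[i] ** (k - 1 - d) * x[j] ** d for d in range(k))
        for i, j in edges
    ]
    return vertex_values + edge_values

def coloring_to_point(colors, k):
    """Map color c in {0, ..., k-1} to the k-th root of unity exp(2*pi*i*c/k)."""
    return [cmath.exp(2j * cmath.pi * c / k) for c in colors]

def is_common_zero(colors, edges, k, tol=1e-9):
    """True if the point encoding the coloring kills every generator."""
    x = coloring_to_point(colors, k)
    return all(abs(v) < tol for v in generator_values(x, edges, k))

triangle = [(0, 1), (1, 2), (0, 2)]
```

For the triangle with `k = 3`, the proper coloring `[0, 1, 2]` yields a common zero, while `[0, 0, 1]` does not, because the edge generator of a monochromatic edge evaluates to a value of modulus `k`.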
Statistical models for cores decomposition of an undirected random graph
The $k$-core decomposition is a widely studied summary statistic that
describes a graph's global connectivity structure. In this paper, we move
beyond using $k$-core decomposition as a tool to summarize a graph and propose
using $k$-core decomposition as a tool to model random graphs. We propose using
the shell distribution vector, a way of summarizing the decomposition, as a
sufficient statistic for a family of exponential random graph models. We study
the properties and behavior of the model family, implement a Markov chain Monte
Carlo algorithm for simulating graphs from the model, implement a direct
sampler from the set of graphs with a given shell distribution, and explore the
sampling distributions of some of the commonly used complementary statistics as
good candidates for heuristic model fitting. These algorithms provide first
fundamental steps necessary for solving the following problems: parameter
estimation in this ERGM, extending the model to its Bayesian relative, and
developing a rigorous methodology for testing goodness of fit of the model and
model selection. The methods are applied to a synthetic network as well as the
well-known Sampson monks dataset.
Comment: Subsection 3.1 is new: `Sample space restriction and degeneracy of
real-world networks'. Several clarifying comments have been added. Discussion
now mentions 2 additional specific open problems. Bibliography updated. 25
pages (including appendix), ~10 figures
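As a small illustration of the statistic involved, the sketch below computes core numbers by repeatedly peeling a vertex of minimum remaining degree and then tallies how many vertices fall in each shell. The adjacency-dictionary input format and the convention that the vector stops at the largest occupied shell are assumptions of this sketch. The paper may index the shell distribution differently.

```python
def core_numbers(adjacency):
    """Core number of each vertex of a simple undirected graph.

    adjacency maps each vertex to the set of its neighbours.
    """
    degree = {v: len(nbrs) for v, nbrs in adjacency.items()}
    remaining = set(adjacency)
    core = {}
    k = 0
    while remaining:
        # Peel a vertex of minimum remaining degree.
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])  # core numbers never decrease while peeling
        core[v] = k
        remaining.remove(v)
        for u in adjacency[v]:
            if u in remaining:
                degree[u] -= 1
    return core

def shell_distribution(adjacency):
    """Entry i counts the vertices whose core number is exactly i."""
    core = core_numbers(adjacency)
    counts = [0] * (max(core.values(), default=-1) + 1)
    for c in core.values():
        counts[c] += 1
    return counts

# A triangle on {0, 1, 2} with a pendant vertex 3 attached to vertex 0.
example = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1}, 3: {0}}
```

For `example`, the pendant vertex lies in the 1-shell and the triangle in the 2-shell, so `shell_distribution(example)` gives `[0, 1, 3]`. The quadratic-time minimum search keeps the sketch short. A bucket-based peeling order would be the usual choice for large graphs.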